compiler: Minor tweaks for elastic code gen #2453
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #2453 +/- ##
==========================================
- Coverage 87.01% 87.01% -0.01%
==========================================
Files 239 239
Lines 44995 45023 +28
Branches 8399 8404 +5
==========================================
+ Hits 39153 39176 +23
- Misses 5109 5114 +5
Partials 733 733
How is this affecting elastic?
The code is the same as far as I understand; do we get faster compilation?
@@ -687,6 +687,10 @@ def __init_finalize__(self, **kwargs):
if not configuration['safe-math']:
    self.cflags.append('--use_fast_math')

# Optionally print out per-kernel shared memory and register usage
if configuration['profiling'] == 'advanced2':
cool!
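For readers unfamiliar with this pattern, here is a minimal sketch of how a profiling level could toggle an extra compiler flag. It assumes a plain dict stands in for Devito's `configuration`, and it uses nvcc's `--resource-usage` option as the reporting flag. The flag actually appended by this PR is not visible in the hunk above, so both `build_cflags` and the flag choice are illustrative assumptions.

```python
def build_cflags(configuration):
    """Collect nvcc flags from a hypothetical, dict-like configuration."""
    cflags = []

    # Mirrors the context lines of the hunk: fast math unless safe-math is on
    if not configuration.get('safe-math', False):
        cflags.append('--use_fast_math')

    # Assumption: `--resource-usage` asks nvcc to report per-kernel register
    # and memory usage; the PR may well use a different flag
    if configuration.get('profiling') == 'advanced2':
        cflags.append('--resource-usage')

    return cflags


flags = build_cflags({'safe-math': False, 'profiling': 'advanced2'})
```

With any other profiling level, only the math-related flag would be emitted.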
minor comment but looks fine
@@ -490,6 +491,11 @@ def _cache_key(cls, *args, **kwargs):
# From the kwargs
key.update(kwargs)

# Any missing __rkwargs__ along with their default values
params = inspect.signature(cls.__init_finalize__).parameters
Ouch, how can we end up in such a weird spot?
Try the newly added tests without this patch 😬 I don't remember the details, but basically caching is bypassed because a different cache key gets computed.
Sure, but I don't get why this needs this elaborate inspect instead of just having StencilDimension implement _cache_key and add step/spacing.
I'm not sure what makes you think it's due to StencilDimension. The test, maybe? But it was not just that. Maybe it emerged from there, but the problem is way more general. In fact, IIRC the issue was the presence/absence of the is_const flag, which pops up after reconstruction but is not part of the key (without this patch) at first instantiation.
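The mismatch described in this thread can be reproduced with a small, self-contained sketch. The `Dim` class and the `naive_key`/`full_key` names below are hypothetical stand-ins, not Devito's implementation; the sketch also assumes optional parameters are always passed by keyword. It only illustrates why a key built from the explicitly passed kwargs differs between first instantiation and reconstruction, and how filling in defaults via `inspect.signature` makes the two agree.

```python
import inspect


class Dim:
    """Toy stand-in for a cached type; not Devito's actual class."""

    def __init_finalize__(self, name, is_const=False):
        self.name = name
        self.is_const = is_const

    @classmethod
    def naive_key(cls, *args, **kwargs):
        # Only what the caller explicitly passed ends up in the key
        key = {'args': args}
        key.update(kwargs)
        return frozenset(key.items())

    @classmethod
    def full_key(cls, *args, **kwargs):
        key = {'args': args}
        key.update(kwargs)
        # Add any omitted parameter together with its default value
        params = inspect.signature(cls.__init_finalize__).parameters
        for name, param in params.items():
            if name not in key and param.default is not inspect.Parameter.empty:
                key[name] = param.default
        return frozenset(key.items())


# First instantiation omits `is_const`; reconstruction passes it explicitly
keys_differ = Dim.naive_key('x') != Dim.naive_key('x', is_const=False)
keys_match = Dim.full_key('x') == Dim.full_key('x', is_const=False)
```

In this toy setting the naive keys differ while the completed keys coincide, which is the behaviour the patch is described as restoring.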
@@ -244,6 +246,20 @@ def add(self, expr, make, terms=None):
        self[base] = self.extracted[base] = make()


def subs_if_composite(expr, subs):
    """
    Call `expr.subs(subs)` if `subs` contain composite expressions, that is
Typo: "contains"
noted
@@ -233,7 +233,7 @@ def make_stencil_dimension(expr, _min, _max):
     Create a StencilDimension for `expr` with unique name.
     """
     n = len(expr.find(StencilDimension))
-    return StencilDimension(name='i%d' % n, _min=_min, _max=_max)
+    return StencilDimension('i%d' % n, _min, _max)
Nitpick: could these just be `min` and `max` now?
They are Python built-in names (functions rather than reserved keywords), so we tend not to shadow them, as it's not recommended.
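A short illustration of the concern, using hypothetical helper functions unrelated to Devito: `min` and `max` are names in the `builtins` module, not reserved keywords, so they can legally be used as parameter names, but doing so hides the built-in functions inside that scope.

```python
import builtins
import keyword

# `min` is a builtin function, not a reserved keyword
min_is_keyword = keyword.iskeyword('min')
min_is_builtin = hasattr(builtins, 'min')


def shadowed_call(min, max):
    # Inside this function `min` is the integer argument, so calling it
    # raises TypeError ('int' object is not callable)
    return min(min, max)


def span(_min, _max):
    # With underscore-prefixed names the builtins stay reachable
    lo, hi = min(_min, _max), max(_min, _max)
    return hi - lo


width = span(3, -2)
```

Calling `shadowed_call(1, 2)` fails at run time, whereas `span` keeps working regardless of argument order.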
7f99f6c to 602c448
Some minor comments but looks fine to me
    Indexed"). Instead, if `subs` consists of just "primitive" expressions, then
    resort to the much faster `uxreplace`.
    """
    if all(isinstance(i, (Indexed, IndexDerivative)) for i in subs):
So why can't this just be moved inside uxreplace?
because it'd contradict the API -- uxreplace performs no re-simplifications.
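The dispatch discussed here can be sketched with plain SymPy, under some loud assumptions: SymPy's `xreplace` stands in for Devito's `uxreplace` (they are not the same function), `Symbol` and `Indexed` stand in for the "primitive" types, and `subs_if_composite_sketch` is a hypothetical name. The point is only that exact-node replacement is enough for primitive keys, while composite keys need the pattern-aware `subs`.

```python
from sympy import Indexed, IndexedBase, Symbol, symbols


def subs_if_composite_sketch(expr, subs):
    """Use exact-node replacement when all keys are primitives, else `subs`."""
    if all(isinstance(k, (Symbol, Indexed)) for k in subs):
        # Stand-in for the fast path (Devito would call `uxreplace` here)
        return expr.xreplace(subs)
    # Composite keys such as `x + y` need the slower, pattern-aware `subs`
    return expr.subs(subs)


x, y, z, w = symbols('x y z w')
i = symbols('i', integer=True)
A = IndexedBase('A')

expr = x + y + z
primitive = subs_if_composite_sketch(A[i] + x, {A[i]: w})
composite = subs_if_composite_sketch(expr, {x + y: w})

# Exact-node replacement alone cannot find `x + y` inside `x + y + z`
untouched = expr.xreplace({x + y: w})
```

Keeping the choice in a separate helper leaves the replacement routine itself free of any re-simplification logic, in line with the reply above.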
b28aafd to 7aec615
via PRO :)
No description provided.